Efficient Parallel Execution for “Un-parallelizable” Codes via Coarse-Grain Speculation

Author

Hari K. Pyla

Abstract

As the number of cores in modern processor architectures keeps growing, programmers must use explicit parallelism to improve performance. However, a large body of extant codes is intrinsically unsuitable for mainstream parallelization techniques because of the execution-order constraints imposed by their data and control dependencies. Realizing the full potential of many-core processors therefore hinges on our ability to parallelize these so-called un-parallelizable codes. This research solves the challenge of enabling efficient parallel execution of such applications.

Similar resources

Hardware Support for Data Dependence Speculation in Distributed Shared-Memory Multiprocessors Via Cache-block Reconciliation

Data dependence speculation allows a compiler to relax the constraint of data-independence to issue tasks in parallel, increasing the potential for automatic extraction of parallelism from sequential programs. This paper proposes hardware mechanisms to support a data-dependence speculative distributed shared-memory (DDSM) architecture that enable speculative parallelization of programs with irr...

Quantitative Analysis of Dataflow Program Execution - Preliminaries to a

While the dataflow execution model can potentially uncover all forms and levels of parallelism in a program, in its traditional fine-grain form, it does not exploit any form of locality. Recent evidence indicates that the exploitation of locality in dataflow programs could have a dramatic impact on performance. The current trend in the design of dataflow processors suggests a synthesis of traditional...

Capsules: expressing composable computations in a parallel programming model

A well-known problem in designing high-level parallel programming models and languages is the “granularity problem”, where the execution of parallel task instances that are too fine-grain incurs large overheads in the parallel runtime and decreases the speed-up achieved by parallel execution. On the other hand, tasks that are too coarse-grain create load imbalance and do not adequately utilize th...

[Title not recoverable: the source shows only figure labels (second-level instruction cache, thread processing units, first-level instruction caches, execution)]

This paper presents a new parallelization model, called coarse-grained thread pipelining, for exploiting speculative coarse-grained parallelism from general-purpose application programs in shared-memory multiprocessor systems. This parallelization model, which is based on the fine-grained thread pipelining model proposed for the superthreaded architecture [11, 12], allows concurrent execution of l...

An Analysis of Latency in Data

Recent evidence indicates that the exploitation of locality in dataflow programs could have a dramatic impact on performance. The current trend in the design of dataflow processors suggests a synthesis of traditional non-strict fine-grain instruction execution and a strict coarse-grain execution in order to exploit locality. While an increase in instruction granularity will favor the exploitation of...

Publication date: 2012
